Abstract

We derive simple empirical color-redshift relations for z < 4 galaxies in the
Hubble Deep Field (HDF) using a linear function of three photometric colors
(U - B, B - V, V - I). The dispersion between the estimated redshifts and the
spectroscopically observed ones is small for relations derived in several separate
color regimes; the dispersions range from $\sigma_z \sim 0.03$ to 0.1 for z < 2 galaxies,
and from $\sigma_z \sim 0.14$ to 0.25 for z > 2 galaxies. We apply the color-redshift
relations to the HDF photometric catalog and obtain estimated redshifts that 
are consistent with those derived from spectral template fitting methods. The 
advantage of these color-redshift relations is that they are simple and easy to 
use and do not depend on the assumption of any particular spectral templates; 
they provide model independent redshift estimates for z < 4 galaxies using 
only multi-band photometry, and they apply to about 90% of all galaxies. We 
provide a color-based estimated redshift catalog of HDF galaxies to z < 4. We 
use the estimated redshifts to investigate the redshift distribution of galaxies 
in the HDF; we find peaks in the redshift distribution that suggest large-scale 
clustering of galaxies to at least z ~ 1 and that are consistent with those
identified in spectroscopic probes of the HDF. 

Subject headings: galaxies: distance and redshifts — methods: data analysis 

1. Introduction

The Hubble Deep Field (HDF) provides accurate multi-band photometry of galaxies 
to very faint magnitudes, with a $10\sigma$ AB magnitude limit of $m_{814} = 27.60$ (Williams et al.
1996). The faint limit of the HDF makes it difficult to obtain spectroscopic redshifts for the
majority of the galaxies in the field. It is therefore useful to derive estimated redshifts of 
these galaxies using the available multi-band photometry. Several groups (Lanzetta et al.
1996; Gwyn & Hartwick 1996; Sawicki et al. 1997) obtained photometric redshifts for the
HDF by comparing the observed UBVI fluxes of each object with a set of galaxy spectral
templates of different galaxy types redshifted to evenly spaced redshifts. Since spectroscopic 
redshifts have been measured and published for ~ 100 galaxies in the HDF (Cohen et al.
1996; Hogg et al. 1998; Steidel et al. 1996; Lowenthal et al. 1997), it is possible to fit
analytic expressions for photometric redshifts. In this paper, we explore a simple empirical
approach to estimating redshifts of galaxies based on their colors (see Connolly et al.
1998, Brunner et al. 1997, and Connolly et al. 1995 for an alternative empirical approach);
this method has the advantage of being simple, model independent (i.e., it does not depend
on the assumption of any particular set of galaxy spectral templates), and easy to use in 
determining approximate redshifts of z < 4 galaxies. 

We determine the empirical analytic relations for color redshifts in §2. We compare 
our estimated redshifts for the HDF galaxies with those obtained by the template-fitting 
method in §3. We describe our estimated redshift catalog of HDF galaxies to z < 4 in §4 
(the Web site address of the catalog is given). We investigate the redshift clustering of HDF 
galaxies in §5, and summarize our results in §6.

2. Empirical Color-redshift Relations

The HDF covers a 4 square arcmin area of sky in the northern continuous viewing zone 
(Williams et al. 1996). The HDF galaxies have been observed in four passbands: $m_{300}$
(which we denote by U), $m_{450}$ (B), $m_{606}$ (V), and $m_{814}$ (I). We use the HDF photometric
catalog by Sawicki et al. (1997); it contains 848 galaxies with measured fluxes in all four
passbands to a magnitude limit of I = 27. Note that we use the AB system following
Sawicki et al. (1997). 

In deriving the color-based estimated redshift relations for the HDF, we use 82 galaxies 
with measured spectroscopic redshifts in this field, with redshifts in the range z ~ 0.1-3.5 
(Cohen et al. 1996; Hogg et al. 1998; Steidel et al. 1996; Lowenthal et al. 1997). Five
of the galaxies (with 2.8 < z < 3.5) are used with only upper limits to their U fluxes (as
derived by Sawicki et al. 1997).

We first divide the galaxy sample into regions of high and low redshifts (z > 2 and 
z < 2) based on empirical color cuts as follows. For z > 2, the galaxies satisfy one of the 
following three color selection criteria: 

$$U > 25.66, \quad U - B > 0.91, \quad B - V < 1.37, \quad V - I < 0.5; \tag{1}$$
$$I > 23.5, \quad U - B > 2.2; \tag{2}$$
$$I > 23.5, \quad B - V > 2.2, \quad U - B > -0.5. \tag{3}$$

Lower redshift galaxies, z < 2, generally fall outside these color-magnitude regions. Note
that the above color selection criteria for z > 2 galaxies reflect our current knowledge
from galaxies with measured spectroscopic redshifts; they may need to be revised as new
data become available. To minimize the redshift dispersion in the empirical color-redshift
analytic fits, we further divide the regions into two color ranges for z > 2 and three color 
ranges for z < 2. These empirical divisions reflect the color shifts as a function of redshift 
for different spectral type galaxies. For each color range, we fit an analytic relation for the
redshift, $z_a$, which is linear in color,

$$z_a = c_1 + c_2\,(U - B) + c_3\,(B - V) + c_4\,(V - I), \tag{4}$$

where $c_i$ (i = 1, ..., 4) are constants. The error in $z_a$ due to photometric errors is

$$\Delta z_a = \sqrt{(c_2\,\Delta U)^2 + [(c_3 - c_2)\,\Delta B]^2 + [(c_4 - c_3)\,\Delta V]^2 + (c_4\,\Delta I)^2}. \tag{5}$$
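The error formula follows from regrouping Eq. (4) by magnitude rather than by color: $z_a = c_1 + c_2 U + (c_3 - c_2) B + (c_4 - c_3) V - c_4 I$, so independent magnitude errors add in quadrature with those weights. A minimal Python sketch of Eqs. (4) and (5) is given below; the function names are illustrative and not part of the paper, and the magnitude errors are assumed to be uncorrelated.

```python
import math


def color_redshift(coeffs, U, B, V, I):
    """Eq. (4): z_a = c1 + c2 (U-B) + c3 (B-V) + c4 (V-I)."""
    c1, c2, c3, c4 = coeffs
    return c1 + c2 * (U - B) + c3 * (B - V) + c4 * (V - I)


def color_redshift_error(coeffs, dU, dB, dV, dI):
    """Eq. (5): propagate independent magnitude errors into z_a.

    The weight of each magnitude is its net coefficient after
    regrouping Eq. (4): c2 for U, (c3 - c2) for B, (c4 - c3) for V,
    and -c4 for I (the sign drops out when squared).
    """
    _, c2, c3, c4 = coeffs
    return math.sqrt(
        (c2 * dU) ** 2
        + ((c3 - c2) * dB) ** 2
        + ((c4 - c3) * dV) ** 2
        + (c4 * dI) ** 2
    )
```

Note that when two adjacent coefficients are nearly equal, the shared magnitude contributes little to the error, since it largely cancels between the two colors.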

We use Eq. (4) to determine the best fit between the observed spectroscopic redshifts
of 82 HDF galaxies at z ~ 0.1-3.5 and their colors, in each of the five separate color
ranges. We find the following best-fit relations and their redshift dispersions, $\sigma_z$. The
redshift dispersions are calculated using the jackknife method (see Lupton 1993 for a
simple description), which is also used to estimate the uncertainties of the coefficients in the
best-fit relations. For easy reference, we assign each color range a number, cr, 1 ≤ cr ≤ 5.
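The text does not spell out the exact jackknife estimator, so the following Python sketch shows one common leave-one-out form under stated assumptions: the dispersion is taken as the rms of the leave-one-out prediction residuals, and the coefficient uncertainties use the standard jackknife variance. The helper names (`solve`, `fit_linear`, `jackknife`) are hypothetical, and the fit is an ordinary unweighted least squares.

```python
import math


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_linear(colors, z):
    """Least-squares (c1, c2, c3, c4) of Eq. (4) via the normal equations."""
    X = [[1.0, ub, bv, vi] for ub, bv, vi in colors]
    AtA = [[sum(r[i] * r[j] for r in X) for j in range(4)] for i in range(4)]
    Atz = [sum(r[i] * zi for r, zi in zip(X, z)) for i in range(4)]
    return solve(AtA, Atz)


def jackknife(colors, z):
    """Return (sigma_z, coefficient errors) from leave-one-out refits.

    colors: list of (U-B, B-V, V-I) tuples; z: list of spectroscopic redshifts.
    """
    n = len(z)
    loo_coeffs, residuals = [], []
    for k in range(n):
        # Refit without galaxy k, then predict its redshift.
        c = fit_linear(colors[:k] + colors[k + 1:], z[:k] + z[k + 1:])
        loo_coeffs.append(c)
        ub, bv, vi = colors[k]
        residuals.append(c[0] + c[1] * ub + c[2] * bv + c[3] * vi - z[k])
    sigma_z = math.sqrt(sum(r * r for r in residuals) / n)
    means = [sum(c[i] for c in loo_coeffs) / n for i in range(4)]
    errors = [
        math.sqrt((n - 1) / n * sum((c[i] - means[i]) ** 2 for c in loo_coeffs))
        for i in range(4)
    ]
    return sigma_z, errors
```

With only 6 to 28 galaxies per color range, each refit uses very few points, which is one reason the coefficient uncertainties quoted for the z > 2 ranges are comparatively large.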

For the z < 2 galaxies (see photometric range discussed above):

(i) cr=1: (U - B) < (B - V) - 0.1: (28 galaxies)

$$z_a = 0.4111 - 0.1852\,(U - B) - 0.3062\,(B - V) + 0.7301\,(V - I), \qquad \sigma_z = 0.034. \tag{6}$$

The $1\sigma$ uncertainties of the coefficients are 0.0036, 0.0058, 0.0084, and 0.0092,
respectively.

(ii) cr=2: (U - B) > (B - V) - 0.1 > (V - I): (21 galaxies)

$$z_a = 0.163 - 0.171\,(U - B) + 0.340\,(B - V) + 0.194\,(V - I), \qquad \sigma_z = 0.095. \tag{7}$$

The $1\sigma$ uncertainties of the coefficients are 0.013, 0.014, 0.045, and 0.047, respectively.

(iii) cr=3: (U - B) > (B - V) - 0.1 < (V - I): (19 galaxies)

$$z_a = 1.126 + 0.480\,(U - B) - 0.513\,(B - V) - 0.250\,(V - I), \qquad \sigma_z = 0.097. \tag{8}$$

The $1\sigma$ uncertainties of the coefficients are 0.029, 0.023, 0.033, and 0.041, respectively.

For the z > 2 galaxies (see photometric range above, Eqs. (1)-(3)):

(i) cr=4: (B - V) - 0.5 > (V - I): (8 galaxies; z > 3)

$$z_a = 2.37 + 0.02\,(U - B) + 1.01\,(B - V) - 2.47\,(V - I), \qquad \sigma_z = 0.14. \tag{9}$$

The $1\sigma$ uncertainties of the coefficients are 0.16, 0.04, 0.23, and 0.40, respectively.

(ii) cr=5: (B - V) - 0.5 < (V - I): (6 galaxies; z < 3)

$$z_a = 2.18 + 0.10\,(U - B) + 0.20\,(B - V) + 0.75\,(V - I), \qquad \sigma_z = 0.25. \tag{10}$$

The $1\sigma$ uncertainties of the coefficients are 0.14, 0.09, 0.34, and 0.77, respectively.

We present in Fig. 1 the best-fit analytic redshifts $z_a$ (from Eqs. (6)-(10)) versus the
measured spectroscopic galaxy redshifts z. The galaxies with known measurement errors
in UBVI are plotted with error bars in $z_a$. We used 82 out of 90 galaxies with available
spectroscopic redshifts;¹ the eight error bars without points in Fig. 1 denote the galaxies not
used in determining the $z_a$ relations. These eight outlying galaxies have a mean dispersion
of $\sigma_z \sim 0.45$ between their estimated redshifts $z_a$ and their spectroscopic redshifts z; they
mostly lie near the boundaries of the five color ranges. A table listing these eight galaxies
is available by anonymous ftp in the elt/:HDF subdirectory of astro.princeton.edu.

It is apparent from Fig. 1 that the above simple relations provide good estimates of
the galaxy redshifts. The constant offsets in Eqs. (6)-(10) represent a rough indicator of
the mean galaxy redshift in a given color range; the larger offsets represent higher redshift
galaxies. The linear color-redshift relation used above is considerably simpler than the
higher-order polynomial fit used by Connolly et al. (1998), and has a smaller number of
free parameters; it also yields a smaller dispersion ($\sigma_z \sim 0.034$ versus 0.097) for one of the
z < 2 color ranges (see Eq. (6)).

¹ We do not count the three z > 2 galaxies for which the published spectroscopic redshifts
are erroneous or very uncertain (M. Sawicki 1998, private communication).

In applying our formulae to the HDF photometric catalog, we first use Eqs. (1)-(3)
to select z > 2 galaxies. In estimating color redshifts for z < 2 galaxies we then use
Eqs. (6)-(8), and for z > 2 redshifts we use Eqs. (9)-(10). In the next section, we compare
these estimated color redshifts with those obtained from spectral template fitting.
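The two-step procedure just described (select z > 2 candidates with Eqs. (1)-(3), then apply the relation for the matching color range) can be sketched in Python as follows. This is an illustrative sketch rather than the code used to build the catalog: the function names are hypothetical, the inputs are assumed to be AB magnitudes in the HDF filters, upper limits on U are not treated specially, and a galaxy sitting exactly on a range boundary (a case the text does not define) simply falls through to the later branch.

```python
# Coefficients (c1, c2, c3, c4) of Eqs. (6)-(10), keyed by color range cr.
COEFFS = {
    1: (0.4111, -0.1852, -0.3062, 0.7301),
    2: (0.163, -0.171, 0.340, 0.194),
    3: (1.126, 0.480, -0.513, -0.250),
    4: (2.37, 0.02, 1.01, -2.47),
    5: (2.18, 0.10, 0.20, 0.75),
}


def is_high_z(U, B, V, I):
    """True if the galaxy satisfies any of the z > 2 cuts, Eqs. (1)-(3)."""
    ub, bv, vi = U - B, B - V, V - I
    cut1 = U > 25.66 and ub > 0.91 and bv < 1.37 and vi < 0.5
    cut2 = I > 23.5 and ub > 2.2
    cut3 = I > 23.5 and bv > 2.2 and ub > -0.5
    return cut1 or cut2 or cut3


def color_range(U, B, V, I):
    """Return the color range number cr (1 to 5)."""
    ub, bv, vi = U - B, B - V, V - I
    if is_high_z(U, B, V, I):
        return 4 if bv - 0.5 > vi else 5
    if ub < bv - 0.1:
        return 1
    return 2 if bv - 0.1 > vi else 3


def estimate_redshift(U, B, V, I):
    """Return (cr, z_a) for one galaxy."""
    cr = color_range(U, B, V, I)
    c1, c2, c3, c4 = COEFFS[cr]
    z_a = c1 + c2 * (U - B) + c3 * (B - V) + c4 * (V - I)
    return cr, z_a
```

For example, `estimate_redshift(24.0, 23.5, 22.5, 22.0)` fails all three z > 2 cuts, falls in color range 1, and gives $z_a \approx 0.377$ from Eq. (6).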



3. Comparison With the Template-fitting Method 

Several groups (e.g., Lanzetta et al. 1996; Gwyn & Hartwick 1996; Sawicki et al.
1997) have obtained photometric redshifts for the HDF galaxies by comparing the observed
UBVI magnitudes with a set of galaxy templates of different spectral types redshifted to
evenly spaced redshifts. The photometric redshifts obtained by these groups are generally
consistent with each other, although large differences exist for some galaxies (Sawicki 1997).

The color-redshift relations derived in the present paper (§2) yield a considerably 
smaller dispersion between the estimated and spectroscopic redshifts than the template- 
fitting method; this occurs because our method explicitly minimizes the dispersion for each 
color range of galaxies with measured redshifts which are used in fitting the color-redshift 
relations. A comparison of our predicted color redshifts, $z_a$, with the photometric redshifts
from template-fitting by Sawicki et al. (1997), $z_{\rm temp}$, is presented in Fig. 2 for the 848
HDF galaxies with I < 27 and with measured UBVI magnitudes. The solid diagonal
line in Fig. 2 indicates $z_a = z_{\rm temp}$; the two dotted diagonal lines mark the region in which
$|z_a - z_{\rm temp}| < 0.5$. The results show that the two estimators, the simple color redshift
estimator and the template-fitting redshift estimator, are generally consistent with each
other, with some outliers. About 90% of all galaxies lie within $|z_a - z_{\rm temp}| < 0.5$. Of
the 10% that lie outside this region (83 out of 848 galaxies), nearly half lie close to the
region's boundary. The dozen discordant redshifts with $z_{\rm temp} \sim 2$ and $z_a < 0.5$ (Fig. 2)
are probably due to the gap in the available HDF spectroscopic redshifts at z ~ 2; thus 
galaxies with true redshifts of z ~ 2 may have been assigned wrong redshifts by the best-fit 
analytic formulae. The analytic formulae presented above can be improved as spectroscopic 
redshifts are measured for more galaxies in the HDF (especially in the missing redshift 
range of 1.4 < z < 2.2). 
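The consistency statistic quoted above is simply the fraction of galaxies whose two redshift estimates differ by less than 0.5. A short Python sketch of that count is shown here; the function name and the default tolerance argument are illustrative choices, not part of the paper.

```python
def fraction_consistent(z_a, z_temp, tol=0.5):
    """Fraction of galaxies with |z_a - z_temp| < tol.

    z_a: color-based redshift estimates; z_temp: template-fitting
    photometric redshifts, in the same galaxy order.
    """
    if len(z_a) != len(z_temp) or not z_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n_inside = sum(1 for a, t in zip(z_a, z_temp) if abs(a - t) < tol)
    return n_inside / len(z_a)
```

With the counts given in the text, 83 of 848 galaxies lie outside the region, so the consistent fraction is (848 - 83)/848, or about 0.90.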

The agreement between the analytic color redshifts we obtain and the spectral 
template-fitting photometric redshifts obtained by Sawicki et al. (1997) is comparable to 
the consistency among the various methods utilizing the spectral template-fitting technique. 
This illustrates that the linear analytic relations based on UBVI colors provide a good and
easy method for estimating galaxy redshifts. In fact, its dispersion in the different color 
ranges is lower than given by the template-fitting methods. 



4. Estimated Redshift Catalog of HDF Galaxies 

We have calculated the estimated redshifts $z_a$ of the HDF galaxies (848 galaxies with
I < 27 and with measured UBVI magnitudes) based on the color relations given above (§2).
We present these redshifts in a catalog that is available by anonymous ftp in the elt/:HDF 
subdirectory of astro.princeton.edu. 

The Estimated Redshift Catalog of HDF Galaxies includes the following information 
for each galaxy: galaxy identification number; x and y pixel positions on the v2 HDF image; 
UBVI magnitudes; our color redshift estimate, $z_a$ (based on Eqs. (6)-(10)); photometric
redshift from the template-fitting method by Sawicki et al. (1997), $z_{\rm temp}$; and, when available, the
observed spectroscopic redshift, z. The color range number of each galaxy, cr (§2), is also
listed in the catalog. Hard copies of the catalog are available upon request.

5. Redshift Clustering 

We use the estimated color redshifts of the HDF galaxies to investigate the large-scale 
redshift clustering of galaxies. The small $\sigma_z$ dispersion found for some of the color ranges
provides the possibility of detecting large-scale clustering among the distant galaxies. 

We use the estimated redshifts $z_a$ for the 848 galaxies with I < 27 from the Estimated
Redshift Catalog of HDF Galaxies (§4). We find that most of the galaxies satisfy either
(U - B) < (B - V) - 0.1 (230 galaxies) or (U - B) > (B - V) - 0.1 < (V - I) (333 galaxies).
Since the number of galaxies is relatively large and the dispersion between $z_a$ and z is
relatively small ($\sigma_z = 0.034$ and $\sigma_z = 0.097$ respectively), we use these two groups to study
the redshift distribution.

Fig. 3 presents the estimated redshift distribution [using Eq. (6)] of 230 HDF galaxies
with (U - B) < (B - V) - 0.1 and I < 27, for a bin size of $\Delta z_a = 0.03$. The redshift
distribution suggests the existence of peaks that indicate large-scale clustering of galaxies to
z ~ 1. To estimate the significance of the observed structures in the redshift distribution,
we calculate the expected smoothed distribution by convolving the observed redshift
distribution with a Gaussian of width $\sigma^r_z = 0.1$. The dashed and dotted lines in Fig. 3
represent the mean smoothed distribution and the $1\sigma$ contour respectively, for $\sigma^r_z = 0.1$ and
$10^4$ realizations of the smoothed distribution. Most of the observed peaks are marginally
significant at levels of 1 to $3\sigma$ above the smoothed distribution. These peaks are consistent
with the peaks revealed by spectroscopic observations of galaxies to z < 0.8 in this region
(Cohen et al. 1996); the locations of the spectroscopic peaks are marked by the arrows
on top of Fig. 3. We see that the locations of our suggested peaks are consistent with those
seen directly with a smaller number of spectroscopically measured redshifts. Fig. 3 suggests
an additional peak at z ~ 0.8 that is not yet confirmed by spectroscopic data. The peaks
are seen consistently in different parts of the HDF field, with no evidence of sub-clustering
on the sky. These peaks suggest large-scale clustering in the galaxy distribution at high
redshifts; they may represent the distant (z ~ 1) counterpart to local superclusters, or
walls, seen at low redshifts (Broadhurst et al. 1990; Bahcall 1991). Most recently, such
peaks have also been seen at z ~ 3 (Steidel et al. 1998).
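The text does not give the details of how the smoothed realizations are generated, so the following Python sketch rests on an assumption: each realization scatters every estimated redshift by a Gaussian random offset of the smoothing width, the scattered values are binned, and the per-bin mean and standard deviation over all realizations give the smoothed distribution and its 1σ band. All function and parameter names are hypothetical.

```python
import math
import random


def histogram(values, z_min, bin_size, n_bins):
    """Count values in n_bins equal bins starting at z_min; drop out-of-range values."""
    counts = [0] * n_bins
    for v in values:
        k = math.floor((v - z_min) / bin_size)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


def smoothed_distribution(z_est, sigma, z_min, bin_size, n_bins,
                          n_real=1000, seed=0):
    """Per-bin mean and standard deviation over Gaussian-scattered realizations."""
    rng = random.Random(seed)
    sums = [0.0] * n_bins
    sq_sums = [0.0] * n_bins
    for _ in range(n_real):
        # Assumed procedure: perturb each redshift by N(0, sigma), then rebin.
        scattered = [rng.gauss(z, sigma) for z in z_est]
        for k, c in enumerate(histogram(scattered, z_min, bin_size, n_bins)):
            sums[k] += c
            sq_sums[k] += c * c
    mean = [s / n_real for s in sums]
    std = [math.sqrt(max(q / n_real - m * m, 0.0))
           for q, m in zip(sq_sums, mean)]
    return mean, std


def peak_significance(observed, mean, std):
    """Excess of the observed counts over the smoothed mean, in units of std."""
    return [(o - m) / s if s > 0 else 0.0
            for o, m, s in zip(observed, mean, std)]
```

In this sketch a bin would count as a candidate peak when its significance lies in the range of 1 to 3 quoted above; the values depend on the bin size and the smoothing width, both of which should be comparable to the redshift dispersion of the color range being studied.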



Fig. 4 presents the redshift distribution of 333 HDF galaxies with (U - B) >
(B - V) - 0.1 < (V - I) and I < 27, for a bin size of $\Delta z_a = 0.09$. (For this group, the redshift
dispersion is $\sigma_z = 0.097$.) The dashed and the dotted lines in Fig. 4 represent the mean
smoothed distribution and the $1\sigma$ contour respectively, for $\sigma^r_z = 0.09$ and $10^3$ realizations
of the smoothed distribution. A peak is suggested at $z_a \sim 1$ and possibly at z ~ 1.3, but
the large dispersion for this color sub-sample appears to wash out any other significant
underlying peaks.



6. Summary and Discussion 

Using HDF photometric and spectroscopic data, we have determined a set of simple 
analytic formulae that yield estimated galaxy redshifts to z < 4 in terms of linear 
combinations of three measured colors, U - B, B - V, and V - I (Eqs. (6)-(10)). The derived
analytic formulae in five color ranges exhibit small dispersions between the estimated and
spectroscopic redshifts. For z < 2 galaxies, the redshift dispersion ranges from $\sigma_z = 0.034$
to $\sigma_z = 0.097$ for different color ranges. For z > 2 galaxies, we find $\sigma_z = 0.14$ and $\sigma_z = 0.36$
for two color ranges which typically represent z > 3 and z < 3 galaxies respectively. These
color-redshift relations apply to about 90% of the galaxies in the sample.

The smallest dispersion between the color and the spectroscopic redshifts, $\sigma_z = 0.034$,
occurs for the z < 2 galaxies satisfying (U - B) < (B - V) - 0.1; 28 galaxies with measured
redshifts are used in deriving the relation for the estimated redshift, with only 4 free
parameters (the coefficients in Eq. (6)). There are 230 HDF galaxies with I < 27 and
measured UBVI magnitudes that belong to this color range; we investigate the large-scale 
redshift distribution of these galaxies and find evidence for peaks in the redshift distribution 
that suggest large-scale clustering to at least z ~ 1. These results are consistent with 
those of Cohen et al. (1996) using observed spectroscopic redshifts of a smaller number of
galaxies. 

We have applied our color redshift formulae to the entire HDF photometric catalog 
and find that the derived redshifts are consistent with those obtained from spectral 
template-fitting techniques. The analytic relations, by design, yield lower dispersion than 
the template-fitting method. The color-redshift relations have the advantage of being 
simple, model independent, and easy to use. They can be further improved with additional 
data. These analytic color-redshift estimators are useful in providing empirical estimates of 
galaxy redshifts to z < 4 using multiband photometry. 

Our Estimated Redshift Catalog of HDF Galaxies, based on our color redshift formulae 
for all 848 HDF galaxies with I < 27 and measured UBVI fluxes, is available by anonymous 
ftp in the elt/:HDF subdirectory of astro.princeton.edu. 

Note that our color-redshift relations (Eqs. (6)-(10)) are derived using AB magnitudes
and for the HDF filters. For application to other photometric catalogs, the appropriate
spectroscopic training set should be used; when such a training set is not available,
Eqs. (6)-(10) may provide useful estimates after appropriate photometric transformation
has been performed between the different filter systems. Also note that these color-redshift
relations should not be applied to galaxies which lie close to the boundaries of the color
ranges.

Finally, we note that our color-redshift relations are limited by the absence of measured 
spectroscopic redshifts for galaxies in the range of 1.4 < z < 2.2 (see Fig. 1 and Fig. 2). It
is very important to obtain spectroscopic redshifts in this range, because it will not only 
enable better calibration of photometric redshifts, it will also help us understand the nature 
of galaxies in the intermediate redshift range.

Fig. 1. Color-based analytic redshift estimate $z_a$ (given by Eqs. (6)-(10)) versus the
spectroscopic redshift z for 90 HDF galaxies. The galaxies with known measurement errors
in UBVI are plotted with error bars in $z_a$.

Fig. 2. Color-based analytic redshift estimate $z_a$ (given by Eqs. (6)-(10)) versus the Sawicki
et al. (1997) template-fitting photometric redshift $z_{\rm temp}$, for 848 galaxies in the HDF with
I < 27 and measured UBVI. The symbols are the same as in Fig. 1. The solid diagonal line
indicates $z_a = z_{\rm temp}$; the dotted lines mark the region $|z_a - z_{\rm temp}| < 0.5$.

Fig. 3. The estimated redshift ($z_a$) distribution of 230 galaxies with (U - B) < (B - V) - 0.1
and I < 27 in the HDF, for a bin size of $\Delta z_a = 0.03$. The dashed and the dotted lines
represent the mean smoothed distribution and the $1\sigma$ contour respectively, for $\sigma^r_z = 0.1$
and $10^4$ realizations of the smoothed distribution (§5). The arrows on the top of the figure
indicate the locations of the peaks to z ~ 0.8 observed from spectroscopic redshifts of a
smaller number of galaxies by Cohen et al. (1996).

Fig. 4. The estimated redshift ($z_a$) distribution of 333 galaxies with (U - B) >
(B - V) - 0.1 < (V - I) and I < 27 in the HDF, for a bin size of $\Delta z_a = 0.09$. The
dashed and the dotted lines represent the mean smoothed distribution and the $1\sigma$ contour
respectively, for $\sigma^r_z = 0.09$ and $10^3$ realizations of the smoothed distribution (§5).



